Predictive variance


Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification

Franzen, Courtney, Pourkamali-Anaraki, Farhad

arXiv.org Machine Learning

Neural network classifiers trained with cross-entropy loss achieve strong predictive accuracy but lack the capability to provide inherent predictive uncertainty estimates, thus requiring external techniques to obtain these estimates. In addition, softmax scores for the true class can vary substantially across independent training runs, which limits the reliability of uncertainty-based decisions in downstream tasks. Evidential Deep Learning aims to address these limitations by producing uncertainty estimates in a single pass, but evidential training is highly sensitive to design choices including loss formulation, prior regularization, and activation functions. Therefore, this work introduces an alternative Dirichlet parameter estimation strategy by applying a method of moments estimator to ensembles of softmax outputs, with an optional maximum-likelihood refinement step. This ensemble-based construction decouples uncertainty estimation from the fragile evidential loss design while also mitigating the variability of single-run cross-entropy training, producing explicit Dirichlet predictive distributions. Across multiple datasets, we show that the improved stability and predictive uncertainty behavior of these ensemble-derived Dirichlet estimates translate into stronger performance in downstream uncertainty-guided applications such as prediction confidence scoring and selective classification.
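The method-of-moments construction described above can be sketched as follows. Assuming the ensemble's softmax outputs for one input are stacked into an (M, K) array, the Dirichlet mean is matched to the ensemble mean and the precision is recovered from the per-class variances; the geometric-mean pooling of the per-class precision estimates is one common convention and not necessarily the authors' exact estimator.

```python
import numpy as np

def dirichlet_mom(probs, eps=1e-12):
    """Method-of-moments Dirichlet fit to an ensemble's softmax outputs.

    probs: array of shape (M, K) -- softmax vectors from M ensemble members
    for a single input. Returns estimated Dirichlet parameters alpha (K,).
    """
    m = probs.mean(axis=0)                # E[p_k] = alpha_k / alpha_0
    v = probs.var(axis=0)                 # Var[p_k] = m_k (1 - m_k) / (alpha_0 + 1)
    # per-class precision estimates alpha_0, guarded against zero variance
    s_k = m * (1.0 - m) / np.maximum(v, eps) - 1.0
    # pool the K estimates via their geometric mean (one standard choice)
    alpha0 = np.exp(np.mean(np.log(np.maximum(s_k, eps))))
    return m * alpha0
```

The resulting alpha could then serve as the starting point for the optional maximum-likelihood refinement mentioned in the abstract.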


Supplementary Material

Neural Information Processing Systems

Let π0(·|s) be a Gaussian behavioral reference policy with mean µ0(s) and variance σ0²(s), and let π(·|s) be an online policy with reparameterization a_t = f_φ(ε_t; s_t) and random vector ε_t. Whilst entropy regularization partially mitigates the collapse of predictive variance away from the expert demonstrations, we still observe the wrong trend similar to Figure 1, with predictive variances high near the expert demonstrations and low on unseen data. AWAC performs online fine-tuning of a policy pre-trained on offline data. The method requires additional off-policy data to be generated to saturate the replay buffer, thereby requiring a hidden number of environment interactions that do not involve learning. To mitigate this, in practice, BRAC adds an entropy bonus to the supervised learning objective, which stabilizes the variance around the training set but has no guarantees away from the data.
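The reparameterized sampling path a_t = f_φ(ε_t; s_t) for a Gaussian policy can be illustrated with a minimal sketch; here mu and log_std stand in for the policy network's outputs at a state, and plain NumPy is used, so only the sampling path (not gradient flow) is shown.

```python
import numpy as np

def reparam_action(mu, log_std, rng):
    """Reparameterized Gaussian policy sample: a = mu(s) + sigma(s) * eps,
    with eps ~ N(0, I) drawn independently of the policy parameters."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_std) * eps, eps
```

In an autodiff framework, writing the sample this way lets gradients pass through mu and log_std while the randomness stays in eps, which is what makes entropy-regularized objectives over π(·|s) trainable.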






2 Neural network ensembles and their relations to kernels

Neural Information Processing Systems

Although the ongoing success of deep learning is remarkable, the increasing data, model, and training-algorithm complexity makes a thorough understanding of their inner workings increasingly difficult.


sup

Neural Information Processing Systems

C.1 2D Synthetic Benchmark. For both benchmarks, we sample 500 observations x_i = (x_{1i}, x_{2i}) from each of the two in-domain classes (orange and blue), and consider a deep architecture, ResFFN-12-128, which contains 12 residual feedforward layers with 128 hidden units and a dropout rate of 0.01.
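A data-generation sketch for such a benchmark is shown below; the class centers and spread are hypothetical stand-ins, since the fragment does not specify the two class distributions.

```python
import numpy as np

def make_two_class_2d(n_per_class=500, seed=0):
    """Sample n_per_class 2D observations from each of two in-domain
    classes (hypothetical Gaussian blobs; the paper's actual class
    distributions are not given in this excerpt)."""
    rng = np.random.default_rng(seed)
    x_orange = rng.normal(loc=(-2.0, 0.0), scale=0.5, size=(n_per_class, 2))
    x_blue = rng.normal(loc=(2.0, 0.0), scale=0.5, size=(n_per_class, 2))
    X = np.vstack([x_orange, x_blue])
    y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])
    return X, y
```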


Multi-level Monte Carlo Dropout for Efficient Uncertainty Quantification

Pim, Aaron, Pryer, Tristan

arXiv.org Machine Learning

We develop a multilevel Monte Carlo (MLMC) framework for uncertainty quantification with Monte Carlo dropout. Treating dropout masks as a source of epistemic randomness, we define a fidelity hierarchy by the number of stochastic forward passes used to estimate predictive moments. We construct coupled coarse-fine estimators by reusing dropout masks across fidelities, yielding telescoping MLMC estimators for both predictive means and predictive variances that remain unbiased for the corresponding dropout-induced quantities while reducing sampling variance at fixed evaluation budget. We derive explicit bias, variance, and effective-cost expressions, together with sample-allocation rules across levels. Numerical experiments on forward and inverse PINN-Uzawa benchmarks confirm the predicted variance rates and demonstrate efficiency gains over single-level MC-dropout at matched cost.
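A minimal sketch of the coupled telescoping estimator for the predictive mean follows, using a toy linear "network" as a stand-in for a dropout model; the level sizes, the sample-allocation list N, and the coupling-by-shared-masks are illustrative of the construction, not the paper's exact estimators (which also cover predictive variances).

```python
import numpy as np

rng = np.random.default_rng(0)
dim, keep = 32, 0.9
w = rng.normal(size=dim)  # toy "network" weights (hypothetical stand-in)

def forward(x, mask):
    # toy dropout forward pass with inverted-dropout scaling
    return float((w * mask / keep) @ x)

def mlmc_mean(x, L, N, m0, rng):
    """Telescoping MLMC estimator of the dropout-induced predictive mean.

    Level l uses M_l = m0 * 2**l forward passes; the coarse estimator at
    each level reuses the first half of the fine level's dropout masks
    (the coupling), so each correction term has small variance."""
    total = 0.0
    for l in range(L + 1):
        Ml, Nl = m0 * 2 ** l, N[l]
        acc = 0.0
        for _ in range(Nl):
            masks = rng.random((Ml, dim)) < keep
            preds = np.array([forward(x, m) for m in masks])
            fine = preds.mean()
            # level 0 estimates the mean directly; higher levels estimate
            # the coarse-fine correction E[P_l - P_{l-1}]
            acc += fine if l == 0 else fine - preds[: Ml // 2].mean()
        total += acc / Nl
    return total
```

Because every level's fine estimator is unbiased for the same dropout-induced mean, the correction terms have zero expectation and the telescoping sum stays unbiased while the shared masks keep each correction's variance small.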


Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel

Neural Information Processing Systems

Identifying unfamiliar inputs, also known as out-of-distribution (OOD) detection, is a crucial property of any decision-making process. A simple and empirically validated technique is based on deep ensembles, where the variance of predictions over different neural networks acts as a substitute for input uncertainty. Nevertheless, a theoretical understanding of the inductive biases leading to the performance of deep ensembles' uncertainty estimation is missing. To improve our description of their behavior, we study deep ensembles with large layer widths operating in simplified linear training regimes, in which the functions trained with gradient descent can be described by the neural tangent kernel. We identify two sources of noise, each inducing a distinct inductive bias in the predictive variance at initialization. We further show theoretically and empirically that both noise sources affect the predictive variance of non-linear deep ensembles in toy models and realistic settings after training. Finally, we propose practical ways to eliminate part of these noise sources, leading to significant changes and improved OOD detection in trained deep ensembles.
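The variance-as-uncertainty technique the abstract builds on can be sketched in a few lines: stack the members' softmax outputs for an input and use their member-to-member variance as the OOD score. This is a generic sketch of the baseline, not the paper's noise-source decomposition.

```python
import numpy as np

def ensemble_variance_score(prob_stack):
    """OOD score from a deep ensemble's disagreement.

    prob_stack: (M, K) softmax outputs of M networks for one input.
    Returns the member-to-member variance summed over classes; higher
    values indicate more disagreement, i.e. more input uncertainty."""
    return float(prob_stack.var(axis=0).sum())
```

Members that agree on a confident prediction give a score near zero, while members that scatter their mass over different classes give a large score, which is exactly the behavior the paper's NTK analysis seeks to explain.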